ARRANGEMENT AND METHOD FOR DETERMINING POSITIONS OF THE TEATS OF
A MILKING ANIMAL

TECHNICAL FIELD OF THE INVENTION 

The present invention generally relates to dairy farm robot
milking and to automatic attachment of teat cups related
thereto.

DESCRIPTION OF RELATED ART AND BACKGROUND OF THE INVENTION 

In a known milking system, wherein teat cups are automatically
attached to the teats of a milking animal to be milked, a robot
arm with a gripper is provided to grip and hold teat cups during
the attachment of the teat cups. A laser emitting laser light
and a video camera provided to register laser light as reflected
from the teats of the milking animal are mounted on the robot
arm. By aid of a method known as laser triangulation, the
positions of the teats can be calculated. The movement of the
robot arm can then be controlled in response to the calculated
positions to be capable of finding the teats for teat cup
attachments.

A drawback of such a milking system is that the camera, while
being moved close to the milking animal, is exposed to dirt and
possibly physical contact with the milking animal since the
milking animal can make sudden movements. Further, the video
camera can only be in active mode to seek for the teats when the
robot arm already has collected a teat cup and initiated a
movement towards the teats since the camera is fixedly mounted
on the robot arm. Still further, the video camera occupies a
considerable space on the robot arm, which may limit the use of
the system.

In Research Disclosure, July 2002, p. 1162, the use is disclosed
of an arrangement comprising a number of cameras fixedly mounted
in the milking stall instead of a video camera mounted on
a movable robot arm. For instance, two or three video cameras
can be mounted at each side of the milking animal to be milked,
preferably on the walls of the milking stall or just outside
thereof. Advantageously, the video cameras are directed
diagonally upwards towards the region where the teats are when
the milking animal has been positioned in the milking stall. A
computer is provided for e.g. selecting two cameras, which
together create a stereoscopic image, which by means of image
processing enables a substantially exact determination of the
position of a teat. Two further cameras may be used to confirm
the three-dimensional position of a teat. A robot arm is then
moved to the teat based on the determined position.

While such an arrangement has several advantages such as a
faster determination of the teats of the milking animal, a
smaller and lighter robot arm, possibilities to better protect
the cameras from dirt and from being damaged by kicks from the
milking animal, and capabilities to monitor the complete milking
operation for e.g. detecting a teat cup falling off the teat, it
may be difficult to obtain a sufficiently robust, accurate,
precise, and fast implementation, which is capable of
controlling the robot arm to obtain a sufficiently high number
of correct teat cup attachments.

SUMMARY OF THE INVENTION 

A milking environment is a very difficult environment in which
to perform stereo vision measurements. The environment is not
clean and dirt may settle on camera lenses. Further, the cows are
moving, and teats may not be visible to the cameras due to
self-occlusion.

Another problem arises since each cow's physiology differs; the
udders of the cows may be located at quite different locations,
which puts limitations on the positions of the cameras. For
instance, the region where the teats most likely are has been
found by empirical studies to be at least 500 x 600 x 480 mm³
large.

Another problem arises since both color and texture of the teats
are similar to the surface of the udder, which means that teat
detection will be an arduous task: the contrast is low and color
filters are of no use. The situation is even more complicated by
the fact that the size, shape, color structure, morphological
structure and texture may vary considerably from animal to animal.

Accordingly, it is an object of the present invention to provide
an arrangement and a method for determining positions of the
teats of a milking animal in a milking system comprising a robot
arm for automatically attaching teat cups to the teats of a
milking animal when being located in a position to be milked,
and a control device for controlling the movement of said robot
arm based on dynamically determined positions of the teats of
the milking animal, which arrangement and method are based on
stereo vision and solve at least some of the problems of the
prior art as set forward above.

It is in this respect a particular object of the invention to
provide such an arrangement and such a method, which use a
stereoscopic calculation method based on repeatedly recorded
pairs of images, wherein for each teat and for each pair of
images, conjugate points in the pair of images for the
stereoscopic calculation can be found easily, efficiently, and
quickly.

It is a further object of the invention to provide such an
arrangement and such a method, which are robust, effective,
fast, precise, accurate, reliable, safe, easy to use, and of low
cost.

It is still a further object of the invention to provide such an
arrangement and such a method, which are capable of obtaining a
very high number of correct teat cup attachments.

These objects among others are, according to the present
invention, attained by arrangements and methods as claimed in
the appended patent claims.

Further characteristics of the invention and advantages thereof
will be evident from the following detailed description of
preferred embodiments of the present invention given hereinafter
and the accompanying Figs. 1-8, which are given by way of
illustration only and are thus not limitative of the present
invention.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figs. 1-4 each display schematically an automated milking
station including an arrangement for determining positions of
the teats of a milking animal according to a respective
preferred embodiment of the present invention. Figs. 1, 3 and 4
are perspective views, whereas Fig. 2 is a top view. The milking
station is only schematically outlined in Figs. 2-4.

Fig. 5 is a schematic diagram illustrating various image 
processing methods, among them methods included in further 
preferred embodiments of the present invention. 

Figs. 6-7 are each a pair of images as taken by a camera pair as
comprised in any of the arrangements shown in Figs. 1-4 and
illustrate yet another preferred embodiment of the invention.

Fig. 8 is a side view of the lower part of a teat of a milking
animal illustrating self-occlusion, a phenomenon which is
compensated for by still another preferred embodiment of the
invention.

DETAILED DESCRIPTION OF EMBODIMENTS 

The outline of this description is as follows. Firstly, a 
milking system wherein arrangements and methods for determining 
positions of the teats of a cow according to the present 
invention may be implemented is overviewed. Thereafter, various 
camera arrangements as used in the invention are considered. The 
following four sections are mainly devoted to image processing. 
The first of these sections deals with image processing in 
general and teat detection in particular. Thereafter, 
stereoscopic calculation methods for determining teat positions 
are considered. The following section deals with various 
calibration methods used in the invention. Finally, image 
processing methods for obtaining further functionality are 
disclosed.

1. The milking system 

In Fig. 1 is shown a milking system or station 3 arranged for
voluntary milking of freely walking animals such as cows,
i.e. the animals enter the milking station 3 in order to be
milked on a voluntary basis. The milking station 3 comprises an
enclosure having an inlet gate 4 and an outlet gate 5, which are
both capable of being opened automatically. The front end of the
milking station 3 is denoted by 3a, the back end is denoted by
3b, the left side is denoted by 3c and the right side is denoted
by 3d.

The milking station 3 comprises further an automatic milking
machine (not explicitly illustrated) including teat cups 11
connected to an end unit by means of milk lines (only the
portions attached to the teat cups 11 are shown in Fig. 1). The
milking station further includes a milking robot or automatic
handling device 14 having a robot arm 15 provided with a
gripper. The milking robot 14 is arranged to automatically apply
the teat cups 11 of the milking machine to the teats of a cow 8
present in the milking station 3 prior to milking. In Fig. 1,
three of the teat cups 11 are arranged in a teat cup rack or
magazine 16, whereas the fourth one is held by the gripper of
the robot arm 15.

Typically, a teat cleaning device including e.g. a teat cleaning
cup 21 or brushes 22 may be provided for cleaning the teats of
the cow 8 prior to milking.

Further, the milking station 3 comprises an identification
member provided to identify a cow approaching the milking
station 3, and a central processing and control device 19, which
is responsible for central processing and controlling of the
animal arrangement, which inter alia includes the initiation of
various activities in connection with the milking such as e.g.
opening and closing of the gates 4 and 5, and control of the
milking machine and its handling device 14. The central
processing and control device 19 typically comprises a
microcomputer, suitable software, and a database including
information on each of the cows milked by the milking machine,
such as when the respective cow was last milked, when
she was last fed, her milk production, her health, etc.

A cow approaching the milking station is thus identified by the
identification member, and the central processing and control
device 19 may then, depending on the identification, give the
cow access to the milking station 3 by means of opening the
inlet gate 4. The teats may be cleaned by the teat cleaning
device, after which the teat cups 11 are applied to the teats of
the cow 8 in the milking station 3 under control of the central
processing and control device 19.

The attachment is enabled by means of locating the teats of the 
cow 8 by a camera pair 23 directed towards the teats of the cow 
8 when being located in the position to be milked. The camera 
pair 23 is provided to repeatedly record pairs of images and may 
for instance comprise two CCD or video cameras. The central 
processing and control device 19 or other control or image 
processing device detects repeatedly the teats of the cow 8 and 
determines their positions by a stereoscopic calculation method 
based on the repeatedly recorded pairs of images. This position 
information is used by the central processing and control device 
19 to send signals to the milking robot to move the robot arm to 
each teat after having gripped a respective teat cup. 

Note that the robot arm 15 also has to move the teat cleaning cup
21 or the brushes 22 to the teats of the cow 8. This may be
performed in the very same manner as the teat cups are moved.

During milking, milk is drawn from the teats of the cow 8 by
means of vacuum being applied to the teat cups 11 via the milk
lines, and the milk drawn is collected in the end unit. After
the milking has been completed the teats of the cow may be
subjected to after treatment, e.g. a spray of disinfectant, and
then the outlet gate 5 is opened and the cow 8 may leave the
milking station 3.

2. The camera arrangement

The location and orientation of the camera pair are very 
important features. The task is to design an arrangement that 
should be capable of detecting and locating every teat for each 
cow. Firstly, all teats must be seen by the cameras used. 
Secondly, the teats must be visualized in a manner that makes 
the image analysis easier. The implementation should also strive 
to maintain a low-cost profile in order to enable potential 
industrial production at the same cost as the prior art systems 
used today. 

In the preferred embodiment of Fig. 1, the camera pair 23 is 
mounted below the teats of the cow 8 and behind the cow when 
being located in the position to be milked so that the camera 
pair 23 is directed diagonally upwards towards the teats of the 
cow 8 when being located in the position to be milked. The 
immediate position behind the cow 8 is in terms of image 
analysis a good location. The teats would be close to the 
cameras and by placing the cameras below the udders, directed 
diagonally upwards, both high and low udders can be detected in 
most cases. A typical distance between the camera pair 23 and 
the teats of the cow 8 is 30 cm. A further advantage of this 
camera position is that the cameras will not be in the way of 
the robot arm 15 or the inlet gate 4. 

The camera pair 23 may be movable up and down, depending on the 
height of the udder, to attain a good contrast between the teats 
and background simply by moving the camera pair up or down. 

In the preferred embodiment of Fig. 2, a first camera pair 23 and
a second camera pair 24 are directed towards the teats of the cow
when being located in the position to be milked, wherein each
camera pair 23, 24 is provided to repeatedly record pairs of
images, and an image processing device is provided for
repeatedly detecting the teats of the cow and determining their
positions by a stereoscopic calculation method based on the
pairs of images repeatedly recorded by each of the camera pairs
23, 24. The first camera pair 23 is arranged behind the cow,
whereas the second camera pair 24 is arranged at the side of the
cow. A typical distance between the second camera pair 24 and
the teats of the cow is 80 cm.

At least two camera pairs may be necessary to obtain a high 
number of successful teat cup attachments. This is because the 
teats can occlude each other. More camera pairs may also relax 
the demands on the image processing. 

The preferred embodiment of Fig. 3 differs from the embodiment 
of Fig. 2 in that the second camera pair 24 is mounted at a 
height so that the teats of the cow will belong to the outer 
contour of the cow in the repeatedly recorded pairs of images 
by that camera pair 24. Such measures would facilitate image 
processing considerably. The second camera pair 24 is 
preferably movable vertically in order to be positioned at a 
height so that the teats of the cow will belong to the outer 
contour of the cow in the repeatedly recorded pairs of images. 

Different cows have their udders at different heights and, in
order to ensure that a teat contour is recorded, the camera pair
24 should be movable vertically.

The preferred embodiments of Figs. 2-3 would not only improve 
the detection rate of the system but would also give more 
reliable stereo calculations since distance calculations from 
two independent systems would be used. 

The movable cameras could be trained for each cow or be trained
the first time a cow enters the milking station 3. The result
could be stored in a database and be updated after each
milking. Thus, by having the possibility to move the cameras of
the camera pairs relative to the milking station 3 and to the
cow when being located in a position to be milked, and
optionally relative to each other, the number of failed teat
position determinations due to teats being obscured can be
reduced.

The preferred embodiment of Fig. 4 differs from the embodiment 
of Fig. 3 in that each of the two camera pairs, here denoted by 
41 and 42, comprises three cameras 41a-c and 42a-c. By such a 
stereo vision system three stereoscopic position calculations 
can be made for each camera pair 41, 42, and thus in total six 
calculations can be made provided that each camera 41a-c and
42a-c can see the teat in question. It would of course be even 
better to use even more camera pairs to achieve even further 
stereoscopic position calculations, or at least render it most 
likely that two cameras in one camera pair can always detect a 
teat of a cow. 
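
As an illustration of the counting above, the following minimal Python sketch enumerates the two-camera combinations available within each three-camera group. The camera labels are borrowed from the reference numerals of Fig. 4; the function name `stereo_pairs` and the `visible` set are hypothetical and introduced here only for illustration.

```python
from itertools import combinations

# Hypothetical camera labels taken from the reference numerals in Fig. 4.
GROUPS = {
    "41": ["41a", "41b", "41c"],
    "42": ["42a", "42b", "42c"],
}


def stereo_pairs(groups, visible):
    """Return every two-camera combination within a group in which
    both cameras currently see the teat in question."""
    pairs = []
    for cameras in groups.values():
        usable = [camera for camera in cameras if camera in visible]
        pairs.extend(combinations(usable, 2))
    return pairs


all_cameras = {camera for cameras in GROUPS.values() for camera in cameras}
print(len(stereo_pairs(GROUPS, all_cameras)))  # 6
```

If a teat is hidden from some cameras, the same function returns only the combinations that remain usable, which is one way to picture why additional cameras make a successful detection more likely.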

According to yet another preferred embodiment of the invention, the
camera pair is located straight below the cow's udder, and the
cameras are directed upwards. For instance the camera pair may
be located in the floor of the milking station behind a
translucent protecting window provided with cleaning means such
as wipers. The camera pair together with the protecting window
could be raised by a telescopic arrangement to be positioned
closer to the teats of the cow. Such a camera pair would always be
capable of detecting all four teats of the cow. However, the
position is extremely vulnerable to dirt and damage.

It shall be noted that the cameras of each camera pair in the
preferred embodiments described above are advantageously
arranged vertically one above the other. The image processing
device may then, for each teat and for each pair of images,
define the position of the lower tip of the teat contour in the
pair of images as conjugate points for the stereoscopic
calculation, and find the conjugate points along a substantially
vertical epipolar line. This increases the accuracy and
precision of the teat position determination considerably. This
is discussed in detail in section 5 with reference to Figs. 6
and 7.

Further, the cameras of each camera pair are arranged so that
the image planes of the cameras of each camera pair are
coplanar. This is not a prerequisite for the stereoscopic
calculations, but it facilitates the same.
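
A minimal Python sketch of how conjugate points on a vertical epipolar line could yield a distance is given here, assuming an ideal pinhole model with coplanar image planes and one camera mounted directly above the other. The focal length in pixels, the baseline, the contour coordinates and the function names are illustrative assumptions and are not taken from this description.

```python
def lowest_tip(contour):
    """Return the (row, col) point of a teat contour with the largest row.

    Image rows are assumed to grow downwards, so this is the lower tip.
    """
    return max(contour, key=lambda point: point[0])


def depth_from_vertical_disparity(tip_upper, tip_lower, focal_px, baseline_m):
    """Distance along the optical axis from two conjugate points.

    With vertically stacked, coplanar cameras the conjugate points are
    assumed to differ only in their row coordinate (vertical disparity).
    """
    disparity = abs(tip_upper[0] - tip_lower[0])
    if disparity == 0:
        raise ValueError("zero disparity: point at infinity or mismatched tips")
    return focal_px * baseline_m / disparity


# Hypothetical (row, col) contours of the same teat in the two images.
contour_upper = [(520, 318), (560, 320), (540, 322)]
contour_lower = [(260, 318), (300, 320), (280, 322)]

tip_a = lowest_tip(contour_upper)
tip_b = lowest_tip(contour_lower)
distance = depth_from_vertical_disparity(tip_a, tip_b, focal_px=800.0, baseline_m=0.1)
print(round(distance, 3))
```

With these assumed numbers the result is roughly 0.31 m, which is of the same order as the typical camera-to-teat distance mentioned for the rear camera pair.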

The cameras are typically video cameras or solid state cameras
such as e.g. CCD cameras sensitive to visible light. Thus,
neither the choice of location nor the indoor/outdoor lighting
should affect the performance of the stereo vision system.
Lighting is an issue of both suppressing natural light
differences and enhancing the image quality to facilitate image
analysis.

The milking station 3 is typically arranged in a barn.
Therefore, an artificial light source particularly provided for
illuminating the udder of the cow to thereby increase the
contrast in the repeatedly recorded pairs of images may be
needed and used.

In the preferred embodiment of Fig. 2, a light source 25 is
provided, which may emit white light, colored light or UV
light. In order to obtain a good contrast between the teats and
the udder or other kind of background, the light source 25
ought to create a back lighting or sidelight for each camera
pair. In Fig. 2 the light source 25 is arranged at a low
position at the left side 3c of the milking station, but
further ahead of the second camera pair 24. The light is
directed backwards and diagonally upwards toward the teats of
the cow. Thus, a back lighting is obtained for the first camera
pair 23 and a sidelight is obtained for the second camera pair
24.

The light source 25 could be movable and capable of being 
directed toward different directions in order to find an 
appropriate illumination of the teats of the cow. 

An external light source for the particular purpose of
illuminating the teats of the cow is preferred but not
compulsory. However, if such a light source is not used, image
analysis has to be used to counteract various lighting effects.
Methods that can be used include e.g. histogram equalization, a
method used to distribute the grey levels evenly over the entire
range. However, since both color and texture of the teats are
similar to the surface of the udder, the contrast between them
may in some instances be too low.
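
Histogram equalization as mentioned above can be sketched in a few lines of plain Python. The sketch below is only an illustration of the general technique on a small integer grey-level image; the function name `equalize_histogram` and the sample image are hypothetical, and integer arithmetic is used for the look-up table purely for reproducibility.

```python
def equalize_histogram(image, levels=256):
    """Spread the grey levels of an integer image over the full range."""
    pixels = [p for row in image for p in row]
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1

    # Cumulative distribution of the grey levels.
    cdf = []
    running = 0
    for count in histogram:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: there is nothing to spread.
        return [row[:] for row in image]

    # Look-up table from old to new grey level (floor division).
    lut = [max(0, (c - cdf_min) * (levels - 1) // (total - cdf_min)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


low_contrast = [[100, 100], [101, 102]]
print(equalize_histogram(low_contrast))  # [[0, 0], [127, 255]]
```

The three neighbouring grey levels 100, 101 and 102 are mapped to 0, 127 and 255, which shows how a narrow range is stretched over the whole scale.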

Alternatively, or additionally, thermal cameras are provided for
visualizing the natural infrared radiation from the cow (as
compared to the radiation from surrounding objects). Thus,
images of the temperature radiation emitted from objects in
front of the cameras are captured. No external light sources are
needed and problems with scattered and stray light are avoided.
Further, the thermal cameras can be used during conditions of
low levels of light and even in complete darkness, e.g. during
night time. So-called active thermal cameras, wherein reflected
infrared or other radiation is used for visualization, are a
further option for the present invention.



3. Image processing: teat detection

Fundamental steps in image processing include the following
processing steps:

> Pre-processing
> Segmentation
> Representation and description
> Classification

The first step is pre-processing. This stage aims to enhance the
image by removing noise or distortions. The main idea is to
increase the chances of success in the later processing
steps by applying methods to suppress some data and to enhance
other data. Common techniques are transformations for enhanced
contrast and noise removal. The transformations can be
combinations of smoothing, different edge detectors or logical
operations.

Next, there is a group of processing techniques commonly 
referred to as segmentation. These techniques intend to divide a 
digital image into objects, i.e. "interesting areas", and 
background. 

Once the image is divided into objects and background the
objects need to be described using descriptors. These
descriptors could be size, grey levels, roundness etc.
At the same time as the descriptors are calculated, each object
is given a unique identification number called a label. This step
is referred to as representation and description.

Finally, a classification algorithm is applied to the object
list. This is a way to use object information to associate the
objects with objects in the milking station which have been
imaged by the camera pair. Typically, some pre-knowledge of the
images recorded is needed to create a good classification
algorithm. The pre-knowledge is important in image analysis, not
just for the classification step. The more information available
regarding a specific type of images, the better the image
analysis can be made.

In order to accurately detect teats, an automatic method has to
be created. The accuracy of the detection system should
preferably be no less than that of prior art systems of today.
To solve such a complex problem it is necessary to divide it into
the smaller processing steps described above. Fig. 5 shows
various methods that can be used for teat detection and to which
of the processing steps they belong. The most preferred methods
are indicated in Fig. 5 and outlined below.

Mean filtering is a blurring pre-processing method, which
softens sharp areas in an image. If used properly it could
remove noise from the image and potentially improve the image
before it is processed by other methods. The method uses the
spatial relationship of pixels to smooth an image. A filter mask
of size M x N pixels is applied. A pixel, usually the center
pixel, in that mask receives a new pixel value, which represents
the mean value of the pixel values within the mask. The bigger
the filter mask the more the image is blurred.
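
The mean filter described above can be sketched as follows in plain Python. This is a minimal illustration only; the function name `mean_filter`, the way image borders are handled (the mask is simply clipped to the image), and the sample image are assumptions made for the sketch.

```python
def mean_filter(image, m=3, n=3):
    """Replace each pixel by the mean of the m x n mask centred on it.

    Near the image border the mask is clipped to the pixels that exist.
    """
    rows, cols = len(image), len(image[0])
    half_m, half_n = m // 2, n // 2
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half_m), min(rows, r + half_m + 1))
                for cc in range(max(0, c - half_n), min(cols, c + half_n + 1))
            ]
            result[r][c] = sum(window) / len(window)
    return result


# A single bright noise pixel on a dark background.
noisy = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
smoothed = mean_filter(noisy)
print(smoothed[1][1])  # 1.0
```

The isolated value 9 is reduced to 1.0 at the center, while its energy is spread over the neighbouring pixels, which is the blurring effect referred to in the text.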

Motion detection is a pre-processing method for removal of
superfluous information in the images by utilizing the cow's
movement. Two images are recorded, one after the other, with a
small time difference. The method utilizes the fact that the grey
level values will have changed where movement has occurred.
Since a cow is never stationary, because of breathing or
balance correction, the time between the two consecutive images
can be small.

A black and white image is created, where white pixels represent
important areas (i.e. areas where the cow's udder likely is) and
black pixels represent uninteresting background areas. This is
performed by changing pixels to white in areas where the absolute
value |pixel_image1 - pixel_image2| exceeds a given constant, and
every other pixel is changed to black. Such a template image will
be used to extract the interesting areas in the original image.
Since the binary template image is scattered, it needs further
processing to be useful. A good manner to remove scatter is to
relax the image. A relaxation method gives the center pixel in
a neighborhood the most frequently occurring value in that
neighborhood. The larger the neighborhood is, the coarser the
image becomes.
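
The thresholded frame difference and the relaxation step described above could be sketched as follows. The function names, the threshold value, the 3 x 3 neighborhood and the tie-breaking rule are assumptions chosen for the illustration, not values taken from this description.

```python
def motion_template(first, second, threshold):
    """White (255) where the grey level changed by more than threshold."""
    return [
        [255 if abs(a - b) > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first, second)
    ]


def relax(template, size=3):
    """Give each pixel the most frequent value in its size x size neighborhood."""
    rows, cols = len(template), len(template[0])
    half = size // 2
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                template[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            white = window.count(255)
            # Ties are resolved in favour of black (an arbitrary choice).
            result[r][c] = 255 if 2 * white > len(window) else 0
    return result


frame_1 = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
frame_2 = [[10, 10, 10], [10, 60, 10], [10, 10, 10]]
template = motion_template(frame_1, frame_2, threshold=20)
cleaned = relax(template)
```

In this toy example the single changed pixel is marked white in `template` and then removed again by `relax`, since it is outvoted by its black neighbours. Larger connected areas of change would survive the relaxation.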

Since it is critical to include the udder area in the final
image it is a good idea to increase the white area in the
template by a pixel expanding algorithm. This could be done by
dividing the image into K x L squares, each with an actual size
of M x N pixels, and changing the pixels of an entire square to
white if there is any occurrence of white pixels inside the
square.
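
A minimal sketch of such a pixel expanding step is shown here, assuming the template is a list of rows with the values 0 (black) and 255 (white). The function name `expand_white` and the block size used in the example are hypothetical.

```python
def expand_white(template, block_rows, block_cols):
    """Turn a whole block white if it contains at least one white pixel."""
    rows, cols = len(template), len(template[0])
    result = [row[:] for row in template]
    for r0 in range(0, rows, block_rows):
        for c0 in range(0, cols, block_cols):
            block = [
                (r, c)
                for r in range(r0, min(r0 + block_rows, rows))
                for c in range(c0, min(c0 + block_cols, cols))
            ]
            if any(template[r][c] == 255 for r, c in block):
                for r, c in block:
                    result[r][c] = 255
    return result


sparse = [
    [0, 255, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
expanded = expand_white(sparse, block_rows=2, block_cols=2)
```

Here the single white pixel causes its whole 2 x 2 block to become white, while the other three blocks stay black.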

If the teats are located in the outer part of the cow's contour, 
all other remaining information about the cow is here 
superfluous. This information is removed by applying a method, 
which is called contour creation. Contour creation firstly 
copies the black and white template image and then performs a 
number of binary erosions on the copy. The result is a smaller 
copy of the original, where every white area has shrunk. By 
subtracting the shrunken image from the original template image, 
the result is an image with white bands around the contour of 
the cow. 

By combining this band template image with the originally
recorded image and exchanging every white pixel with its
corresponding grey pixel, an image with a band of grey levels is
created. This image is further processed.
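
The contour creation and the subsequent combination with the recorded image can be sketched as below. The 3 x 3 structuring element, the treatment of pixels outside the image as black, and all function names are assumptions made for this illustration.

```python
def erode(template):
    """One binary erosion with a 3 x 3 square structuring element.

    Pixels outside the image are treated as black, so white areas that
    touch the image border also shrink there.
    """
    rows, cols = len(template), len(template[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            all_white = all(
                0 <= r + dr < rows
                and 0 <= c + dc < cols
                and template[r + dr][c + dc] == 255
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            )
            result[r][c] = 255 if all_white else 0
    return result


def contour_band(template, erosions=1):
    """Subtract an eroded copy from the template to leave a white band."""
    shrunk = template
    for _ in range(erosions):
        shrunk = erode(shrunk)
    return [
        [t - s for t, s in zip(t_row, s_row)]
        for t_row, s_row in zip(template, shrunk)
    ]


def apply_band(grey, band):
    """Keep the grey level where the band is white, black elsewhere."""
    return [
        [g if b == 255 else 0 for g, b in zip(g_row, b_row)]
        for g_row, b_row in zip(grey, band)
    ]


white_area = [[255] * 5 for _ in range(5)]
band = contour_band(white_area, erosions=1)
grey_image = [[10 * r + c for c in range(5)] for r in range(5)]
banded = apply_band(grey_image, band)
```

For the 5 x 5 white square the band is the one-pixel-wide outer ring; more erosions give a wider band. `banded` then contains the original grey levels on that ring only.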

This last step can typically not be applied fully to images
taken from the rear position since the teats in those images are
not part of the outer contour of the cow. By only using the
first three steps of the motion detection method, background
areas are removed from the image, which still reduces the amount
of data considerably.

Thus, according to the present invention, an image processing
device is provided to perform any of the above mentioned steps
to reduce the area in which the teats likely are.

Experiments have shown that the phase plays an important role in
the perception of visual features. The most common features in
images consist of combinations of steps, roofs and ramp profiles
as well as Mach band effects. There is no single linear feature
detector that can detect those combinations. But by using the
phase component of the Fourier Transform these features can be
extracted. The phase plays a decisive role in the perception of
visual features. If a human was asked to draw a sketch of the
image, localizing precisely the edges or markings of interest as
seen in the scene, then the chosen points would be those where
there is maximal order in the phase components of a
frequency-based representation of the signal. Unfortunately, the
phase congruency model is hard to implement but could be
approximated with local energy in a signal. So instead of
searching for local maxima in the phase congruency function,
local maxima in the local energy function should be found.
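
To make the notion of local energy more concrete, the following one-dimensional Python sketch computes it as the magnitude of the analytic signal, i.e. the square root of the squared signal (mean removed) plus its squared Hilbert transform. This is one common textbook definition and is used here as an assumption; the description does not specify the exact formulation, and a practical detector would work on two-dimensional images with banks of filters. A direct discrete Fourier transform is used to keep the example free of external libraries.

```python
import cmath
import math


def local_energy(signal):
    """Local energy of a 1-D signal via the analytic signal (periodic DFT)."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]

    # Forward discrete Fourier transform (O(n^2), fine for short signals).
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

    # Keep DC (and Nyquist), double positive frequencies, drop negative ones.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for k in range(1, upper):
        weights[k] = 2.0

    # Inverse transform gives the analytic signal; its magnitude is the energy.
    analytic = [
        sum(weights[k] * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)) / n
        for t in range(n)
    ]
    return [abs(a) for a in analytic]


# A step profile: dark for eight samples, bright for eight samples.
step = [0.0] * 8 + [1.0] * 8
energy = local_energy(step)
peak = max(range(len(step)), key=lambda i: energy[i])
```

Because the transform treats the signal as periodic, the step profile has two transitions (between samples 7 and 8, and at the wrap-around between samples 15 and 0), and the local energy is largest at the samples adjacent to them.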

The outputs of this method are two images, one containing the
magnitude and one the orientation of the features. The phase
congruency feature detector is therefore suitable to be used in
combination with a Canny edge detection algorithm, see below.
One issue that needs to be handled is that in combination with
the motion detection method erroneous edges will be created
where data has been removed. Since removed data is represented
as black sections within the image, edges will occur in the
contact area between the black sections and the important data
area.

Thus, according to the present invention, an image processing
device is provided, for each time the teats are to be detected,
to apply an edge detection algorithm based on the phase
congruency model of feature detection to thereby find edges and
corners in a recorded pair of images that most likely include
those of the teats of the cow.

A preferred edge-detection method is a Canny detection
algorithm.

Next, a representation and description processing method is
applied for calculating features of the edges and corners found
in the recorded pair of images. Preferably, a labeling
algorithm, such as a connected-components labeling method, and a
feature vector method are used.
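
A connected-components labeling step followed by the calculation of a small feature vector per object might look as follows. The choice of 8-connectivity, the particular features (area, bounding-box width and height, lowest row) and the function name are assumptions made for this sketch.

```python
from collections import deque


def label_components(binary):
    """Label 8-connected components and compute simple features for each.

    Returns (labels, features), where labels has the same shape as binary
    (0 = background) and features maps each label to a dictionary.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    features = {}
    next_label = 1
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or labels[r][c]:
                continue
            # Breadth-first flood fill from this unlabeled object pixel.
            labels[r][c] = next_label
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            features[next_label] = {
                "area": len(pixels),
                "height": max(ys) - min(ys) + 1,
                "width": max(xs) - min(xs) + 1,
                "lowest_row": max(ys),
            }
            next_label += 1
    return labels, features


edges = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
labels, features = label_components(edges)
```

In this example two objects are found: a horizontal one in the top row and a vertical one at the right-hand side. The resulting feature dictionaries are the kind of input a later classification step could use.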

Finally, a classification method is applied for identifying,
based on the calculated features, those of the found edges and
corners in the recorded pair of images which belong to the teats
of the cow. The classification method typically includes
a low-level classification algorithm and a high-level
classification algorithm such as a hierarchical chamfer matching
algorithm.
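
The core idea of chamfer matching can be sketched as follows: a template of contour points is shifted over the edge image, and the shift giving the smallest mean distance from the template points to their nearest edge points is selected. The sketch below is a brute-force, single-resolution illustration only; it is not hierarchical and does not use a distance transform, and the U-shaped template, the clutter point and all names are hypothetical.

```python
import math


def chamfer_score(template_points, edge_points, offset):
    """Mean distance from each shifted template point to its nearest edge point."""
    dy, dx = offset
    total = 0.0
    for ty, tx in template_points:
        y, x = ty + dy, tx + dx
        total += min(math.hypot(y - ey, x - ex) for ey, ex in edge_points)
    return total / len(template_points)


def best_offset(template_points, edge_points, offsets):
    """Return the candidate offset with the lowest chamfer score."""
    return min(offsets, key=lambda o: chamfer_score(template_points, edge_points, o))


# Hypothetical U-shaped template as (row, col) points, tip at the bottom.
teat_template = [(0, 0), (1, 0), (2, 1), (1, 2), (0, 2)]

# Edge points: the same shape shifted by (3, 4) plus one clutter point.
edge_points = [(y + 3, x + 4) for y, x in teat_template] + [(0, 9)]

candidates = [(dy, dx) for dy in range(6) for dx in range(6)]
match = best_offset(teat_template, edge_points, candidates)
print(match)  # (3, 4)
```

A hierarchical variant would typically first search at a coarse resolution and refine only the promising offsets at finer resolutions, which reduces the number of score evaluations.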



WO 2005/094565 PCT/SE2005/000427

4. Image processing: calibration 

If stereo vision is to be useful, calibration is a decisive 
component. Stereo vision or stereoscopic calculation strongly 
relies on accuracy in calibration and measurements. 

The basic idea is built on the ideal case: a pinhole camera that
is perfectly aligned. This is not the case in reality. To
calculate scene point coordinates, additional information is
needed. These are the camera constant and the baseline distance.
In reality the relative position of the cameras differs by more
than just the baseline displacement. The camera also has
important parameters other than the camera constant. The image
center can be displaced and there are probably many types of
distortions in the image. There are for example radial
distortion and de-centering distortion. Radial distortion is
seen as a scene point being imaged closer to or further out from
the image center than it should have been. This distortion is
dependent on the distance from the image center.
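
The dependence of radial distortion on the distance from the image center can be illustrated with a commonly used first-order polynomial model, in which a point at radius r from the center is displaced by the factor (1 + k1 r²). The model, the coefficient `k1` and the function name are assumptions for this sketch; the description does not state which distortion model is used.

```python
def apply_radial_distortion(point, center, k1):
    """Displace an ideal image point radially by the factor (1 + k1 * r^2).

    A positive k1 moves the point further out from the image center,
    a negative k1 moves it closer to the center.
    """
    x = point[0] - center[0]
    y = point[1] - center[1]
    r_squared = x * x + y * y
    factor = 1.0 + k1 * r_squared
    return (center[0] + x * factor, center[1] + y * factor)


image_center = (100.0, 100.0)
ideal = (110.0, 100.0)
further_out = apply_radial_distortion(ideal, image_center, k1=0.001)
closer_in = apply_radial_distortion(ideal, image_center, k1=-0.001)
```

With these assumed numbers the point 10 pixels to the right of the center is moved to about 11 pixels for the positive coefficient and to about 9 pixels for the negative one, while the center itself is never displaced.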

The camera intrinsic parameters are collected with the interior
calibration. This gives the actual image center and other
distortion parameters. These are calculated to be able to
reconstruct the actual image.

Before the resulting images can be used and distance 
calculations can be done, the relation between the camera 
positions has to be determined. This is achieved by relative 
calibration. Here both parameters that describe the rotation and 
displacement are estimated. 

If the calculated distance to a scene point should be useful,
its relation to another coordinate system is significant. The
absolute calibration is used to determine this relation, both
rotation and transformation, between two coordinate systems. In
this case the relation between a stereo vision system and the
robot coordinate system is of interest.
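
Once such a relation is known, a point measured in the stereo vision coordinate system can be mapped into the robot coordinate system. The sketch below assumes that the relation is expressed as a 3 x 3 rotation matrix and a displacement vector; the example rotation (90 degrees about the vertical axis), the displacement values and the function name are purely illustrative and are not calibration results.

```python
def to_robot_frame(point_cam, rotation, translation):
    """Map a 3-D point from camera coordinates to robot coordinates.

    Computes rotation * point + translation with plain Python lists.
    """
    return tuple(
        sum(rotation[i][j] * point_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


# Hypothetical calibration result: 90 degree rotation about the z axis
# followed by a displacement of the origin (in metres).
ROTATION = [
    [0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]
TRANSLATION = (1.0, 2.0, 3.0)

teat_cam = (0.3, 0.0, 0.0)
teat_robot = to_robot_frame(teat_cam, ROTATION, TRANSLATION)
```

Any error in the estimated rotation or displacement is carried directly into the robot-frame coordinates, which is one way to see why the accuracy of the absolute calibration matters.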

It has been found experimentally that absolute calibration is
much more difficult than relative calibration, and that many
more reference points in the milking station are needed.
Further, the stereo vision system must be continuously
calibrated due to movements and other disturbances, which do
occur in the milking station, and which affect calibration
points placed therein. The accuracy of the calibration points
must be 10 times higher than the accuracy demand on the teat
positioning. The parameters needed in the stereo calculation are
the position and rotation of both cameras in the milking station
coordinate system, the image center and the radial and
de-centering distortion factors.

Due to the difficulties encountered when performing absolute
calibration to obtain highly accurate absolute teat positions,
the present invention proposes a manner of operation which
remedies this problem.


An image processing device is provided in an initial stage of a
teat cup attachment to roughly determine the absolute position
of a teat of the cow in a coordinate system of the milking
station, by aid of which the robot may fetch a teat cup and move
it to a position close to a teat of the cow present in the
milking station. The image processing device is then provided,
i.e. in a later stage of the teat cup attachment when the robot
arm is close to attaching the teat cup to the teat of the cow, to
repeatedly and accurately determine the position of the teat of
the cow relative to the robot arm or the teat cup. Thus, the
position of the teat of the cow relative to the robot arm or the
teat cup is determined more exactly than the absolute position
can be measured.

The image processing device is provided, in the later stage when
the robot arm is close to attaching the teat cup to the teat of
the cow, to repeatedly determine the relative position of the teat
of the cow in a coordinate system of the camera pair used, and
to repeatedly detect the robot arm or the teat cup and determine
its relative position in the coordinate system of the camera
pair by the stereoscopic calculation method of the present
invention.
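Why the relative position places lower demands on the absolute
calibration can be seen from a minimal sketch: when the teat and the
teat cup are both measured in the coordinate system of the same
camera pair, a common displacement error cancels in their
difference. The function names and numerical values are assumptions
for illustration, and only a displacement error is considered; a
rotation error would not cancel entirely, although its effect is
small when the teat cup is close to the teat.

```python
def relative_offset(teat_cam, cup_cam):
    """Vector from the teat cup to the teat, both given in camera pair coordinates."""
    return tuple(t - c for t, c in zip(teat_cam, cup_cam))


def shifted(point, bias):
    """Apply a common displacement error (hypothetical calibration bias) to a point."""
    return tuple(p + b for p, b in zip(point, bias))


# Hypothetical measurements in metres.
teat = (0.10, 0.02, 0.80)
cup = (0.08, 0.02, 0.85)
bias = (0.01, -0.02, 0.03)      # assumed displacement error of the calibration

exact = relative_offset(teat, cup)
biased = relative_offset(shifted(teat, bias), shifted(cup, bias))
```

The two offsets agree up to rounding, although both underlying
positions are wrong by the bias.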

During each absolute calibration, the camera pair is provided to 
record a pair of images, wherein several well defined points are 
located in the common field of vision of the camera pair, the 
positions of the well defined points being known in the 
coordinate system of the milking station, and the image
processing device is provided to perform an absolute 
calibration process, in which the positions of the image planes 
of the cameras of the camera pair are determined in the 
coordinate system of the milking system to thereby be capable 
of determining the absolute position of the teat of the cow. 

Similarly, during each relative calibration, the camera pair is 
provided to record a pair of images, wherein several well 
defined points are located in the common field of vision of the 
camera pair, and the image processing device is provided to 
perform a relative calibration process, in which the positions 
of the image planes of the cameras of the camera pair are 
determined relative to each other to thereby be capable of 
determining the position of the teat of the cow relative to 
another measured position.

An advantage of this process is that a very fast and accurate
teat cup attachment is enabled, while no extremely high demands
are put on the calibration process.

5. Image processing: stereo calculation

Once the teats of the cow have been detected their positions
have to be determined by a stereo vision or stereoscopic
calculation method. Stereo vision is the recovery of three-dimensional
information of a scene from multiple images of the
same scene. Stereo vision in computer systems tries to copy the
human way of calculating distance. The principle is based on
triangulation. Points on the surface of objects are imaged in
different relative positions in the images recorded depending
on their distance from the viewing system. The basic model is
two ideal pinhole cameras viewing the same scene, only with a
baseline distance separating them. The image planes should
preferably be coplanar, which means that there is no relative
rotation between the cameras.

In order to compute the best baseline distance for a system a
number of parameters have to be considered: camera positions,
type of cameras, image area, relative positions, milking
station constraints and distance-accuracy demands. The baseline
is therefore a trade-off between the mentioned parameters.
Measurements have shown that a baseline distance of about 5-20
cm would be a suitable choice. Further investigations have
shown that the baseline distance should be about 1/5 of the
distance from the cameras to the object of interest. The camera
pair placed on the side of the milking station would then have
a baseline distance of approximately 80/5 cm = 16 cm. The rear
camera pair behind the cow is closer to the object and should
have a baseline distance of about 30/5 cm = 6 cm. If the
disparity increases, the probability of finding and detecting
the conjugate pair decreases.
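The rule of thumb above can be written out as a short calculation.
The function names are assumptions for illustration; the object
distances of 80 cm and 30 cm are those implied by the figures given
in the text.

```python
def suggested_baseline_cm(object_distance_cm, divisor=5.0):
    """Rule of thumb from the text: baseline is about 1/5 of the object distance."""
    return object_distance_cm / divisor


def within_measured_range(baseline_cm, low_cm=5.0, high_cm=20.0):
    """Check a baseline against the 5-20 cm range suggested by the measurements."""
    return low_cm <= baseline_cm <= high_cm


side = suggested_baseline_cm(80.0)   # side camera pair, about 80 cm from the teats
rear = suggested_baseline_cm(30.0)   # rear camera pair, about 30 cm from the teats
```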

One scene point is imaged in different locations in a first
image and a second image recorded by differently placed
cameras. The images of the scene point in the first image plane
and in the second image plane both lie on the epipolar line and
are called a conjugate pair or conjugate points. The epipolar
line is the line that connects the images of the scene point in
the first and second images. If the scene point is found in the
first image, the scene point in the second image is on the
epipolar line. In the ideal model this means that if the
cameras are arranged horizontally side by side the first and
second image points are on the same image row. Conjugate points
are a pair of image coordinates, one from each image, which are
each a projection from the same point in space onto the
respective image plane. The displacement between the image
points is called the disparity and will finally give the depth
information.
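How the disparity gives the depth information can be sketched for
the ideal model of two aligned pinhole cameras, where the distance
to the scene point equals the camera constant times the baseline
divided by the disparity. The function and parameter names, the
units and the numerical values are assumptions for illustration; a
real system would first correct for the calibration parameters
discussed in section 4.

```python
def depth_from_disparity(camera_constant_px, baseline_m, disparity_px):
    """Distance to a scene point in the ideal pinhole model: Z = c * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return camera_constant_px * baseline_m / disparity_px


def scene_point(x_px, y_px, cx, cy, camera_constant_px, baseline_m, disparity_px):
    """Scene point coordinates in the coordinate system of the first camera."""
    z = depth_from_disparity(camera_constant_px, baseline_m, disparity_px)
    x = (x_px - cx) * z / camera_constant_px    # back-project the image column
    y = (y_px - cy) * z / camera_constant_px    # back-project the image row
    return x, y, z


# Hypothetical values: camera constant 800 pixels, baseline 0.16 m, disparity 160 pixels.
point = scene_point(420.0, 240.0, 320.0, 240.0, 800.0, 0.16, 160.0)
```

With these assumed values the point lies about 0.8 m from the
cameras; halving the disparity would double the calculated distance.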

For the stereo vision calculations the choice of the points
that constitute the conjugate pair is important. They have to
represent interesting measuring points and should not
bring any additional error to the distance calculations. The
conjugate points should also be easy to find.

According to the invention the position of the lower tip of the
teat contour in the pair of images is defined as conjugate
points.

The usual manner to model a stereo vision system is to have a
horizontal displacement of the cameras. Fig. 6 shows a pair of
images 91a-b as taken by a camera pair mounted horizontally
side by side. The contour images of the teats are denoted 92a
and 93a in the left hand image 91a, and 92b and 93b in the
right hand image 91b. The epipolar lines are denoted by 94 for
the teat images 92a-b and by 95 for the teat images 93a-b. This
would result in difficulties in finding the conjugate points
because the lines representing the lower tips of the teat
contours are essentially horizontal and the epipolar lines are
horizontal. Since these lines are almost parallel it is
difficult to select a correct conjugate point.

According to a preferred embodiment of the present invention,
the cameras of each camera pair are arranged vertically one
above the other as has been disclosed in section 2 above. Fig.
7 shows a pair of images 101a-b, as taken by a camera pair
mounted vertically one above the other. The contour images of
the teats are denoted 102a and 103a in the upper image 101a,
and 102b and 103b in the lower image 101b. The epipolar lines
are denoted by 104 for the teat images 102a-b and by 105 for
the teat images 103a-b.

Through this arrangement of the cameras, the epipolar lines
become orthogonal to the lines representing the lower tips of
the teat contours. This increases the accuracy and
simplifies the conjugate point detection.

Additional information or conjugate points may be needed. This
can be achieved by selecting further points on the lines
representing the lower tips of the teat contours. The teats'
internal orientation will indicate on which teat line the
conjugate point is to be found. The thickness and teat angle
are two interesting parameters. These parameters should not be
difficult to extract from the teat tip lines.




In stereo vision systems different kinds of occlusion occur.
Depending on the cameras' relative positions the two images from
a stereo camera pair differ. One teat can be visible in one
image but be partly or fully occluded in the other image of the
stereo image pair. Another issue is the fact that the edges of a
teat visualized in two images are in fact not the same physical
edge.

Fig. 8 is a side view of the lower part of a teat of a milking
animal illustrating self-occlusion. The teat tip 111 is modeled
as a half sphere. Two projection lines 112a and 112b from
cameras at different heights illustrate the fact that the lower
tip of the teat contour in the images recorded by the cameras at
different heights does not correspond to the same physical scene
point on the teat of the cow. In the image by the upper camera
point 113a will be recorded as the lower tip of the teat
contour, whereas in the image by the lower camera point 113b
will be recorded as the lower tip of the teat contour. This
leads inevitably to errors in the calculation of the positions
of the teats.

This phenomenon may be compensated for according to yet another
preferred embodiment of the invention. An image processing
device can be provided, for each teat and for each pair of
images, to compensate for any deviations caused by the fact that
the lower tip of the teat contour in the pair of images
corresponds to different object or scene points on the teat
due to the different perspective views in which the pair of
images are recorded, by means of creating a mathematical model
of the characteristic form of the teat, and to calculate the
compensation based on the mathematical model, the different
perspectives, and the distance to the teat.

The exact distance to the teat is typically unknown, whereupon
the image processing device can be provided to determine the
positions of the teats of the milking animal, and to calculate
the compensation for any deviations caused by the fact that the
lower tip of the teat contour in the pair of images corresponds
to different object points on the teat, in an iterative
process, wherein better teat positions and better compensation
can be obtained for each iteration.
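One conceivable form of such an iterative compensation is sketched
below in a two-dimensional side view, with the teat tip modeled as a
half sphere as in Fig. 8. This is an illustrative assumption and not
necessarily the calculation used by the invention: the cameras are
assumed to lie one above the other at horizontal position zero, the
radius of the half sphere is assumed to be known from the teat
model, and all names and values are hypothetical.

```python
import math


def ray_angle_to_tip(cam_height, centre_z, centre_h, radius):
    """Angle of the ray that grazes the underside of the sphere, i.e. the ray
    along which a camera sees the lower tip of the teat contour."""
    alpha = math.atan2(centre_h - cam_height, centre_z)       # direction to sphere centre
    dist = math.hypot(centre_z, centre_h - cam_height)
    return alpha - math.asin(radius / dist)                   # lower tangent ray


def intersect(h1, a1, h2, a2):
    """Intersect two rays starting at (0, h1) and (0, h2) with angles a1 and a2."""
    z = (h2 - h1) / (math.tan(a1) - math.tan(a2))
    return z, h1 + z * math.tan(a1)


def compensated_tip(h_up, ang_up, h_low, ang_low, radius, iterations=10):
    """Estimate the true lowest point of the teat from the two contour tip rays."""
    z, h = intersect(h_up, ang_up, h_low, ang_low)    # uncompensated first estimate
    for _ in range(iterations):
        # With the current distance estimate, compute how far each contour ray
        # is tilted away from the sphere centre, and aim the rays at the centre.
        beta_up = math.asin(radius / math.hypot(z, h - h_up))
        beta_low = math.asin(radius / math.hypot(z, h - h_low))
        z, h = intersect(h_up, ang_up + beta_up, h_low, ang_low + beta_low)
    return z, h - radius                              # lowest point lies one radius below


# Hypothetical geometry in metres: cameras 6 cm apart, teat 30 cm away.
H_UP, H_LOW, RADIUS = 0.46, 0.40, 0.012
TRUE_Z, TRUE_H = 0.30, 0.50                            # true sphere centre
a_up = ray_angle_to_tip(H_UP, TRUE_Z, TRUE_H, RADIUS)
a_low = ray_angle_to_tip(H_LOW, TRUE_Z, TRUE_H, RADIUS)

naive = intersect(H_UP, a_up, H_LOW, a_low)
corrected = compensated_tip(H_UP, a_up, H_LOW, a_low, RADIUS)
```

In this assumed geometry the uncompensated estimate is off by a
couple of millimetres in distance, whereas the iterated estimate
approaches the true lowest point at (0.30, 0.488).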

6. Image processing: further functions

When using a stereo vision camera system it is possible to
expand the functionality to more than locating teats and robot
arms or teat cups. For instance, an image processing device can
be provided to automatically detect injuries and/or dirt on the
teats of the cow by an image processing method based on one or
several of the repeatedly recorded pairs of images. A single
image could be used to discover visual defects, such as redness,
cuts, sores or wounds, on the teats and the udders of the cows.
The camera system may also monitor the milking process or
inspect almost any part of the milking station. This would be
easier if the cameras could be rotated independently of each
other.

Furthermore, the teat cleaning device of the milking station
may be designed to be capable of cleaning teats of cows
according to any one of a plurality of different teat cleaning
schemes, and then one of the plurality of different teat
cleaning schemes can be selected for the cleaning of the teats
of the cow based on the automatic detection of injuries and/or
dirt on the teats of the cow. Teat cleaning can in this manner
not only be performed on a cow individual basis but also on a
teat individual basis.

The stereo vision system can also be used to create 3D images of
objects. In those images the teat thickness and deformations can
be analyzed.

Further, the stereo vision system can be used when a technician
monitors the milking station, or inspects the milking station
for errors. Thus, the stereo vision system can operate as a
simple surveillance system. It can easily detect any movement
around the robot, so that trespassers can be detected.

The stereo vision system is vulnerable to dirt or damage on the
lenses of the cameras. By using image information, dirt can be
detected, allowing the system to attend to the problem.

By recording an image of the milking station e.g. when it is
empty, a good reference image is created. The image can then be
compared with later images. If the images do not differ by more
than a given extent, no dirt has settled on the lens. This
method is suitable for detecting lumps of dirt but it will
probably not work well when it comes to detecting slowly smeared
lenses, since this kind of dirt seldom appears suddenly, but is
rather a slow deterioration in image quality. However, a smeared
image could be compared to a strong blurring filter: the number
of edges found in an image would be smaller the blurrier the
lens is. Thus, by using images of the empty milking station it
is probably possible to discover lumps of dirt and smeared areas
by applying two different techniques.
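The two techniques can be sketched as follows, with gray scale
images represented as lists of rows of pixel values. The function
names, the thresholds and the simple horizontal edge measure are
assumptions for illustration; a practical system would probably use
a proper edge detector and thresholds tuned to the installation.

```python
def changed_fraction(reference, current, pixel_threshold=30):
    """Fraction of pixels that differ noticeably from the reference image."""
    total = changed = 0
    for ref_row, cur_row in zip(reference, current):
        for ref_px, cur_px in zip(ref_row, cur_row):
            total += 1
            if abs(ref_px - cur_px) > pixel_threshold:
                changed += 1
    return changed / total


def has_lump(reference, current, max_changed_fraction=0.02):
    """Technique 1: a lump of dirt shows up as a sudden local difference."""
    return changed_fraction(reference, current) > max_changed_fraction


def edge_count(image, edge_threshold=100):
    """Count strong intensity steps between horizontally neighboring pixels."""
    count = 0
    for row in image:
        for left, right in zip(row, row[1:]):
            if abs(right - left) > edge_threshold:
                count += 1
    return count


def is_smeared(reference, current, min_edge_ratio=0.5):
    """Technique 2: a smeared lens blurs the image, so fewer edges are found."""
    return edge_count(current) < min_edge_ratio * edge_count(reference)
```

The first check compares pixel values directly and therefore reacts
to sudden changes, whereas the second compares only the number of
edges and can thus also reveal a gradual loss of sharpness.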

If the cameras can be rotated around their own axes it is also 
possible to rotate them into such positions where they could 
detect dirt on each other. A lens is an easy object to 
automatically detect and it is possible to create a method for 
detecting dirt on an otherwise very clean surface. 



